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Abstract 

We consider the problem of finding, for a given quadratic measure of non-unifor- 
mity of a set of points (such as L2 star-discrepancy or diaphony), the asymptotic 
distribution of this discrepancy for truly random points in the limit N ^ oo. We 
then examine the circumstances under which this distribution approaches a normal 
distribution. For large classes of non-uniformity measures, a Law of Many Modes in 
the spirit of the Central Limit Theorem can be derived. 
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1 Introduction 

In the field of numerical integration, there are two aspects of the general problem which 
bear on the accuracy of the numerical result. The first is of course the behaviour of the 
integrand: typically, wildly fiuctuating functions are integrated with less accuracy than 
relatively smooth ones, for the same number of integration points. The second one is 
the distribution of the set of points at which one evaluates the integrand. It stands to 
reason that, if one has no a-priori knowledge of the integrand, a set of points that is 
fairly uniformly distributed may be expected to do better than one in which many points 
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cluster together. It is therefore useful to quantify and study the notion of 'uniformity of 
point sets', and this has been the topic of a great number of publications JI], |[. The most 
important of such notions are those of the star- discrepancy and L2 star- discrepancy , and 
more recently other measures of non- uniformity that go under the name of diaphony have 
been introduced as well [Q]. In this paper, we shall call all such measures 'discrepancies'. 

As has been shown in Ref. [^, |^, the use of a particular discrepancy in assessing the 
uniformity of a given point set implies that one has some notion of the generic behaviour of 
the integrand: it is tacitly assumed that the integrands to be attacked belong to some class 
of functions. The particular discrepancy is then recognized as the average-case complexity 
of the integration problem over that function class 0, 0|. 

While the Monte Carlo method, in which the integration points are chosen at random, 
has long been recognized as a robust and useful way of evaluating multivariate integrals, 
its relatively slow convergence has inspired a search for other point sets whose discrepancy 
is lower than that expected for truly random points. Such low- discrepancy point sets 
and low- discrepancy sequences have developed into a veritable industry, and sequences 
with (asymptotically, for large A^) very low discrepancy are now available, especially for 
problems with very many variables ||^. For point sets that are extracted as the first 
N elements of such a sequence, though, one is usually still compelled to compute the 
discrepancy numerically, and compare it to the expectation for random points in order to 
show that the point set is indeed 'better than random'. This implies, however, that one has 
to know, for a given discrepancy, its expectation value for truly random points, or preferably 



even its probability density. In Refs. g |T^, 0, |T2| we have solved this problem for large 



classes of discrepancies. Although computable, the resulting distributions are typically not 
very illuminating. The exception is usually the case where the number of dimensions of the 
integration problem becomes very large, in which case a normal distribution often arises 
1^, In this paper, we investigate this phenomenon in more detail, and we shall describe 
the conditions under which this 'law of large dimensions' applies. 

The layout of this paper is as follows. In section 2, we define the general structure of a 
discrepancy related to a class of integrands of which it is an average-case complexity. We 
show how to derive the probability density of this discrepancy when viewed as a stochastic 
variable defined on sets of truly random points. Then, we investigate the conditions under 
which this density approaches a normal density. Finally, in section 3 we apply our results 
to a few toy-models and standard choices of discrepancy. A number of technical points are 
collected in the various Appendices. Throughout this discussion, we shall only consider 
the asymptotic limit of a very large number of integration points. This implies that, in 
this paper, we cannot make any statements on how the number of points has to approach 
infinity with respect to the number of dimensions, as was for instance done in Ref. [p!3[ . 
In Ref. we repair this defect, and shall be able to show which precise combination of 
limits has to be taken. 
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2 General definitions and statements 



To set the stage, we shall always consider the integration region to be the s-dimensional unit 
hypercube K = [0, 1)"*. The point set consists of N points x^, where A; = 1, 2, . . . , iV 
labels the points and /x = 1, 2, . . . , s their co-ordinates. 



2.1 Quadratic discrepancy and complexity 

We will define quadratic discrepancies as the average-case complexity of an integration 
problem in terms of its averaged squared integration error 0]. For the given class of real- 
valued functions /(x), with x & K, let a measure dfj,{f) on the class of functions be given, 
such that the one- and two-point connected Green's functions are given by 

/(xi)/(x2) rf/x(/) = / h{y;xi)h{y;x2)d^{y) . (1) 



L 



Here we assume, that we can define a function h{y, x) and a measure dii{y) over some 
space L such that the above expression makes sense. The variable y has to be suitably 
defined; it may be a continuous variable with a continuous integration measure dfi{y), or a 
discrete variable, in which case / dfj,{y) reduces to a sum over an enumerable set of discrete 
values, such as a lattice: all cases we will consider in this article can be expressed in terms 
of an enumerable set of discrete values. For the moment we will stick to the more general 
notation of dfi{y). we define the quadratic discrepancy^ _Djv as follows 0]: 



D 



N 



N 



[/] dfxif) , (2) 



1 ^ r 

In Ref. [0 it was shown that, if the function measure dfi{f) is Gaussian in the sense 
that the only non- vanishing connected Green's function is the two-point function, then the 
integration error will be normally distributed with zero mean and variance equal to Dn/N. 
The discrepancy Dj^j can be written as 



kl=l 



(3{xk,xi) = / uj{y]Xk)uj{y]Xi) djiiy) , uj{y;Xk) = h{y;Xk)- / h{y;x)dx. (4) 
'l Jk 



^Note that we have taken a factor N out of the definition of the discrepancy compared to other 
definitions in the hterature. This has the advantage that the discrepancy averaged over the ensemble of 
truly random point sets is independent of N. 



3 



In fact, D]^ measures how well the function h{y] ■) is integrated by the point set Xat, 
averaged over y. Notice that D^v is nonnegative by construction, and that for an infinite 
equidistributed sequence, liniAr^oo -DAr/iV = 0. Moreover, the expected value of Dn for a 
set of truly random points in K is given by 



^[Dn] = [ V[/]rf/i(/) = / / uj{y;xydxdfi{y) , (5) 

J J L J K 



where E [■] denotes the expectation value w.r.t. the uniform distribution over the ensemble 
of truly random point sets with points, and V[f] is the variance of the function /(■). We 
shall always assume this expectation value to be a finite quantity, otherwise this discrepancy 
cannot meaningfully be used for truly random points. 

In our approach to the calculation of discrepancy distributions, we will also use the 
higher momenta E [D^ff\ (m = 1, 2, 3, . . . ), which therefore have to be assumed to be finite^. 
We will also define some useful functions [3k and r^: 



l3k{xi,X2) = / (3{xi,x)(3k-i{x,X2)dx , (6) 
Jk 

r(?/i,?/2) = / uj{yi]x)uj{y2;x)dx , (7) 
Jk 

^kiyi,y2) = j T{yi,y)Tk-i{y,y2)dfx{y) , (8) 

with (5i = (3 and Fi = F. The function F is in a certain sense dual to the function (3. It 
will be more convenient to use, because the variable y is often an element of a countable 
set and F can then be viewed as a matrix, with Tk{yi,y2) = ^{yi,y2Y- 

2.2 Gaussian measures on a countable basis 

In this paper, we shall consider function classes with functions / that can be written as 
linear combinations of a countable set of basis functions {un}'- 

fi^) = ^VnUnix) . (9) 

n 

Often we will refer to the basis functions as modes. We assume that integrals over combi- 
nations m„j(x)m„2(x) ■ ■ ■ exist and introduce the parameters 



= I Un{x) dx and a^.n = / Um{x)un{x) dx . (10) 
Jk Jk 

The variance of / can then be written as 

= / f{xfdx-{ / f{x)dx\ = t;mW„ (a^,„ - . (11) 

Jk \Jk J 



m,n 



^For the discrepancies we discuss, this is a vahd assumption. 
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A Gaussian measure on the class of functions is obtained by taking 

Mf) = n ''"^'"' -^-. ■ (12) 

For the measure to be suitably defined, the strengths cr„ have to satisfy certain restrictions. 
In particular we want the functions / to be quadratically integrable on the average. The 
reasonable requirement that E [-D^r] must exist ensures that the variance of the functions 
/ exists on the average and thus imposes a condition on the strengths: 



E [D^] = Jv[f] d^^{f) = J2 [Un] 



(13) 



Now we can use the formalism of the previous section to construct the discrepancy. The 
two-point connected Green's function is given by 

/ fiXi)f{x2)dfl{f) = ^alUnixi)Un{x2) , (14) 

which is nothing but a spectral representation. The functions h and cu can be taken equal 
to 

hn{x) = CTr^Unix) , U;„(x) = (7„(li„(x) - W^) , (15) 

where the variable y is replaced by the countable index n. The function /3 and the matrix 
r are given by 

P{xi,X2) = '^al{Un{xi) - Wn){Un{x2) - Wn) , (16) 
n 

Note that we have for the trace of Vm,n'- 

T:(r) = J]a^VK] = HDn] • (18) 

n 

2.3 General form of discrepancy distributions 

We now turn to the problem of computing the probability density of such a discrepancy 
when the points are (independently and uniformly) randomly distributed over K. In- 
troducing the Dirac ^-distribution and its representation as a Laplace transform, we may 
write the probability density H{t) for the value t of discrepancy = Dn{xi,X2, ■ ■ ■ ,xn) 
as 



H{t) = f f ■■■ f S{Dn{xi,X2,... , 
Jk jk Jk 



xn) — t) dxidx2 ■ ■ ■ dxN 



1 



= 2^ / e-''Go{z)dz , (19) 
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where the z integration runs along the imaginary axis, and Go (-2) is the moment-generating 
function 

G,{z) = E[e^^-] = Y.^_^™ ■ (20) 

m>0 

At this point it may be useful to note that, since D^r is nonnegative by construction, we 
must have H{t) = for t < 0, and hence no singular point of ^0(2;) may have a negative 
real part. 

The task is, now, to compute Gq{z) as a series expansion around z = 0. In Refs. p|. 



T0| , [1^ we have shown how Feynman diagrams may be usefully employed to do this in 



a systematic way in the limit of large N. In this paper we shall restrict ourselves to the 
leading behaviour iV — >^ 00, in which limit we have 

log(Go(^)) = y]^^fc , Rk = I f3k{x,x)dx = [ Tk{y,y)dij{y) . (21) 
fc>0 '^'^ 

In those cases where the y variables are discrete and enumerable, F can be written as a 
real symmetric matrix, and then we simply have 

Go{z) = (det(l-22F))-^/=^ , Rk = Tr (F'^) . (22) 

We shall - symbolically - employ the matrix and trace notation for the continuous case as 
well. In general, we have 

T,iy^,y2) = A^y^.y^) - B{y{)B{y2) , 



A{yi,y2) = / h{yi]x)h{y2]x)dx , B{y) = / h{y;x)dx . (23) 
Jk J k 

In many cases (c/. the case of orthonormal functions bases), we have B{y) = 0, but this 
is not necessary. In general, then, Tr (F'^) consists of 2^^ terms. However, as shown in 
Appendix A, we can combine them nicely and arrive at 



Goiz) = exp{1p{z))/^/x{z) 

2zy 

2k 



fc>0 

X{z) = 1 + ^(2^)^= Tr . (24) 

fc>0 



2.4 Standardized variables and the Gaussian limit 

We now have derived the expression for Go{z) in the large- limit. Given the form of 
r(yi,y2); we can now compute H{t) for given discrepancy t, if only numerically; in fact 
this was done for the L2 star-discrepancy in Ref. ^ for several dimensionalities. In some 
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special cases, H{t) can even be given in more-or-less closed form |T0|, [TT|. Here, however, 
we are interested in possible Gaussian limits, and therefore it is useful to replace the value 
t of the discrepancy by the standardized variable ^, as follows: 

t = m = E[D^]+eVV [Dn] , (25) 

where the expectation E [D^] and variance V [D^] of the discrepancy (which equal Ri and 
2R2, respectively) are taken out such that the stochastic variable C, always has expectation 
zero and variance 1. By furthermore going over from z to u = z/\/2R2 in Eq. ([T9|) , we can 
write the probability density H{^) of ^ as 

^(0 = ^^(^(0)-"^'^^^ 



k 

k>3 



27ii J ^ \2' ^ k 

-ioo \ ^>3 

7k = Rl/Rl . (26) 



All information on the particulars of the discrepancy are now contained in the constants 
7/c, and we have that the probability density of ^ approaches the normal density whenever 
Ik ^ ^ for all k > 3. It remains to examine under what circumstances this can happen. 

2.5 A Law of Many Modes 

Let us assume, for the moment, that the matrix F is indeed a real symmetric matrix, for 
instance the case of Gaussian measures on a countable basis. Moreover, since we know 
that Go{z) has no singularities for negative values of Kez, the eigenvalues of F are also 
nonnegative, and we may write 

n \ n / \ n / 

where the various eigenvalues have been denoted by A„. Note that the sum may run over a 
finite or an infinite number of eigenvalues, but all these sums must converge since E [D^] 
is finite. Note, moreover, that 7^ is homogeneous of degree zero in the A„: therefore, any 
scaling of the eigenvalues by a constant does not influence the possible Gaussian limit 
(although it will, of course, affect the mean and variance of -Dat). 
We now proceed by noting that 7^+1 < 7^, because 
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where the first inequahty is simply the Schwarz inequahty, and the second one holds because 
the A„ are nonnegative. This means that 7^ will approach zero for k > 3, whenever 73 
approaches zero. To see when this happens we define 

Xn = , X = maxxn , (29) 

V ^—^in m 

SO that Xln^n = 1- It is then trivial to see that 

x^ < < X , (30) 

from which we derive that the necessary and sufficient condition for the discrepancy dis- 
tribution to approach a Gaussian is that 

C = „ , „ ^ , A = maxA„ . (31) 

n 

The Gaussian limit is thus seen to be equivalent to the statement that even the largest 
eigenvalue becomes unimportant. Clearly, a necessary condition for this is that the total 
number of no n- vanishing eigenvalues (number of modes) approaches infinity. Incidentally, 
the condition (pTF) also implies that 



A ^ , J2^n ^ , (32) 

n 

for all those discrepancies that have E[DAr] = ^„ A„ = 1. This is eminently reasonable, 
since a distribution centered around 1 and (by construction) vanishing for negative argu- 
ment can only approach a normal distribution if its variance approaches zero. On the other 
hand, the condition A —» is by itself not sufficient, as proven by a counterexample given 
in Appendix B. 

Another piece of insight can be obtained if we allow the eigenvalues to take on ran- 
dom values. We may introduce the rather dizzying concept of an ensemble of different 
definitions of discrepancy, each characterized by its set of eigenvalues (all nonnegative) 
A = {Ai, A2, . . . , Am}, with the usual constraint that they add up to 1; we keep M finite 
for simplicity. A natural probability measure on this ensemble is given by the probability 
density -Pa (A) of the random vector A: 



Pa(A) = r(M)5|^^A„-lj . 



(33) 



Here F denotes Eulers gamma-function. It is easily computed that the expectation and 
variance of Rk = J2n -^n given, for large M, by 

^[Rk\--j^ , y[Rk\ , (34) 
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so that the Rk become sharply peaked around their expectation for large M. In that case, 
we have 



73 - ^ , (35) 

and we see that, in the above sense, almost all discrepancies have a Gaussian distribution 
in the limit where M, the number of modes, approaches infinity. 



3 Applications to different examples 
3.1 Fastest approach to a Gaussian limit 

We now examine the various definitions of discrepancies, and assert their approach to a 
Gaussian limit. Usually this is envisaged, for instance in Ref. []TB[, as the limit where the 
dimensionality s of becomes very large. But, as we have shown, this is only a special 
case of the more general situation where the number of relevant modes becomes very large: 
another possible case is that where, in one dimension, the number of modes with essentially 
equal strength (T„ becomes very large. As an illustration, consider the case where the basis 
functions with the Gaussian measure are orthonormal and M of the nontrivial modes 
have equal strength = 1/M, and the rest have strength zero. The moment-generating 
function then takes on a particularly simple form, and so does the discrepancy distribution 
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,og(G„(.), ^-flog(l-l) , ^ V-/= . (36) 

It is easily seen that the gamma- distribution H{t) approaches a normal one when M 
becomes very large. At the same time, we see the 'physical' reason behind this: it is 
the fact that the singularity of Gq{z) in the complex plane (in the more general case, the 
singularity nearest to z = 0) moves away to infinity. One observation is relevant here: in 
Eq. (p6|), we have kept the integration over u along the imaginary axis Rew = 0. We might 
consider performing a saddle-point integration, with a non- vanishing value of Keu. That 
may give us, for a finite number of modes, a good approximation to the actual form of 
H(t). It is quite possible, and, indeed, it happens in the above equal-strength model, that 
this approximation is already quite similar to a Gaussian. In the equal-strength model, a 
saddle-point approximation for H{t) gives precisely the form of Eq. (|36D , the only difference 
being that r(M/2) is replaced by its Stirling approximation. On the other hand, for not- 
so-large M, this form is not too well approximated by a Gaussian centered around t = 1, 
since the true maximum resides at t = 1 — 2/M. Nevertheless, in this paper we are only 
interested in the limiting behaviour of H{t), and we shall stick to the use of condition ( pT]) 
as an indicator of the Gaussian limit. 

One interesting remaining observation is the following. For any finite number M of 
eigenvalues An (n = 1,2,... ,M), the smallest value of the indicator C = A^/^„A^ is 



9 



obtained when A„ = 1/M for all n. In this sense, the equal-strengths model gives, for finite 
M, that discrepancy distribution that is closest to a Gaussian. 

3.2 L2 star-discrepancy and the Wiener measure 

Here we shall discuss the standard L2 star-discrepancy 0]. We start with a formulation of 
the problem using a continuous variable y on and dfi{y) = dy. The function h is given 
by 

s 

h{y;x,) = l[9{x^,<yn , (37) 

where we have introduced the 6{-) as the logical step-functionQ. The Gaussian function 
measure corresponding to this discrepancy is therefore seen to be defined by 

f f{x^)f{x2)dfi{f) = f[ min(l - 1 - x^) , (38) 

which we can recognize as that variation of the standard Wiener sheet measure in which 
the function /(x) is pinned down at x = (1, 1, . . . , 1) rather than at x = (0, 0, . . . ,0). This 
is the content of the original Wozniakowski lemma from Ref. [0. 

A formulation of this discrepancy in terms of a Gaussian measure on a countable basis 
can be constructed by realizing that a spectral representation of the integration kernel 
g{xi,X2) = n^=i 1^111(^1 ) 2^2) exists and is given by 

g{xi,X2) = '^crlurt{xi)uHix2) , (39) 

n>0 

where the functions are given by 

s 

Urtix) = r/^l[sm{r{n^)^x^) , (40) 
and the strengths a'i by 

(^) ' = (2n+l)^(n>0) . (41) 



Because a Gaussian measure is completely defined by its two-point Green's function, the 
measure defined by the basis functions is equivalent with the Wiener measure. In 



•^The logical step-function 9{P) of an expression P is equal to 1 if the P is true, and if P is false. 
Therefore 0{x < y) is in fact equal to the Heavyside function 9{y — x). 
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Appendix C we show that the discrepancy defined using this formulation of the Gaussian 
measure on a countable basis is equivalent to the L2 star-discrepancy. 
The functions Un are orthonormal, and we have 

Wn = 2"*/^^ and a„^„ = 6^^^ , (42) 

where we introduced the Kronecker symbol Sm,n- The matrix F is given by 

r?fi,n — '^m^fh,n 2 (T^CT^ , (43) 

and an eigenvalue equation for the eigenvalues A can be written down easily: 

. (44) 



Wi^l - A) 



In value the strengths a-fi are degenerate. Labelling the strengths with different values by 
CTp with p = n^=i '"('^m); the degeneracy is given by 

Qw{p) = $^^^(p = nr(n,)j , (45) 

n>0 V M=l / 

SO that A = is solution to the eigenvalue equation with a {Qwip) ~ l)-fold degeneracy. If 
we factorize these solutions we obtain the following equation for the remaining eigenvalues: 

4 

l-rY.Q^{p)^^ = . (46) 

Some assertions concerning the remaining eigenvalues can be made using this equation. 
On inspection, it can be seen that there are no negative solutions, nor solutions larger 
than cr^, so that (j\ can be used as an upper bound of the eigenvalues of F. If we order 
the A such that Ai > A3 > . . . , then o"^ > Ai > > A3 > . . . . This implies that 
Tr (F^) = Qw{p) (yf - e where < e < ap. Note that Y.p Qw{p) erf = Tr (^^) so that 



traces of (7^ are upper bounds of traces of F'^. Now we have 



(2n + 1] 

n>0 ^ ' 



and therefore for A; > 3: 



/^(2A:)^W. _/4\^ /2 



(47) 



The second factor decreases monotonically from (15)*^ for s = 1 to one as s — 00; for 
the first factor, we note that 1 < ^(2fc) < ^(4) for all > 2. Therefore 7^ can be made 
arbitrarily small by choosing s large enough, and the Gaussian limit of high dimensionality 
is proven. Note, however, that the approach is not particularly fast: for large s, we have 
73 ~ (24/25)'^ ~ exp(— s/25), so that s has to become of the order of one hundred or 
so to make the Gaussian behaviour manifest. In fact, this was already noted by explicit 
numerical computation in Ref. 
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3.3 Diaphony 

3.3.1 General definition 

In one dimension the discrepancy defined through a Gaussian measure on a countable basis 
is called diaphony if the basis functions are such that 

Wn ^ and a^,„ = . (49) 

These relations are typically satisfied when the functions are orthonormal and uo{x) — 1 
is one of the basis functions. The matrix F is given by 

SO the eigenvalues are given by the squares cr^ of the strengths itself. An extension to more 
dimensions can be obtained by taking products Ufi{x) = n^=i '^n^{x^) of one dimensional 
functions. However, in contrast to the Wiener sheet measure that underlies the L2 star- 
discrepancy, there appears to be no 'natural' generalization of the strengths (T„ to more 
dimensions, and therefore wc shall discuss various possibilities. In general, we want to let 
the strength cTfj depend on a global property of the vector n, for instance, the product of 
the components, or the sum of the components: we shall call such alternatives clusterings. 

3.3.2 Fourier diaphony 

As an application of the above, let us consider the orthonormal functions defined by the 
one-dimensional factors 

U2k-i{x) = -\/2 sin(27rfca;) , U2k{x) = \/2cos{27Tkx) , fc = 1,2,3,... . (51) 

Furthermore, it is useful to take the o"^ such that the sine and cosine modes with equal 
wavenumbcr appear with equal coefficients. Let us define 

k{n) = k9{2k-l<n<2k) . (52) 
We require that only depends on n via k{n): 

an = a [k{n)^ , k{n) = {k{ni), k{n2), . . . , k{ns)) . (53) 
In that case, the diaphony is equal to 

N 



Dn = ^^a'^{\ki\,\k2\,... ,\ks 



N 



E 



e ' 



(54) 



where, this time, the vector k runs over the whole integer lattice except the origin; and it 
has the appealing property that the value of the Fourier discrepancy is the same for point 
sets differing only by a translation mod 1; the L2 star-discrepancy does not have this nice 
property. 
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3.3.3 Fourier diaphony with product clustering 

One of the most straightforward generahzations of the Fourier diaphony, and the choice 
made in Ref. 0, is to let o"rj depend on the product of the frequency components: 



= n -u ^^^s 1 n ' = 0in = 0) + kin)9in>0) . (55) 

1 + 7r^/3r — 1 r(n,,) 



The normalization of the ensures that E [Dn] = 1, independent of s. In this case, 
keeping in mind that sines and cosines occur with equal strength, we have to consider the 
multiplicity function 

n>0 \ ^J, J 

Actually, before assigning a strength a^, or rather a^, we have to know the behaviour 
of Q^{p) in order to ensure convergence of E[Djv]- In order to do so, we introduce the 
Dirichlet generating function for Q^{p): 

where we use the Riemann ( function. Since this function (and, therefore, Fs^\x) as well), 
converges for all a; > 1, we are ensured that Q^{p) exceeds the value cp^"*"" at most for a 
finite number of values of p, for all positive c and e. This is proven in Appendix D. It is 
therefore sufficient that decreases as a power (larger than 1) of p. In fact, taking 

al = cp-^ , /? > 1 , (58) 
we immediately have that 

= E^n^' = E^"(p)^f = c'^iii+nmr-i] , (59) 

n>0 p>0 

which, for given /3, fixes c such that Ri = E[Dn-] = 1, and, moreover, gives 

73~a(^)^ as s^oo , a((3) = (^ + ^^(3/^)) _ .g^. 

(l + 2C(2/3))-' ^ ' 

As indicated above, in Ref. [Q] the value /5 = 2 is used, with a(2) ~ 0.291. The supremum 
of a{P) equals 1/3, as /? — >• cx), and the (more interesting) infimum is a(l), about 0.147. 
We conclude that, for all diaphonies of the above type, the Gaussian limit appears for 
high dimensionality. For large /?, where the higher modes are greatly suppressed, the 
convergence is slowest, in accordance with the observation that the 'equal-strength' model 
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gives the fastest convergence; however, the convergence is still much faster than for the 
L2 star-discrepancy, and the Gaussian approximation is already quite good for s ~ 4. 
The fastest approach to the Gaussian limit occurs when we force all modes to have as 
equal a strength as is possible within the constraints on the j3. The difference between the 
supremum and infimum of a(/3) is, however, not much more than a factor of 2. 

Another possibility would be to let (t^ depend exponentially on p. In that way one can 
ensure convergence of the while at the same time enhancing as many low-frequency 
modes as possible. It is proven in Appendix D that the function 

(x) = 5^gn(^)^P (61) 

p>Q 

has radius of convergence equal to one, and therefore we may take = {(3'Y with (3' 
between zero and one. If we choose (3' to be very small, we essentially keep only the modes 
with p = 1, and therefore in that case we have 73 ~ 1/(3^ — 1). This is of course in reality 
the same type of discrepancy as the above one, with f3 00. On the other hand, taking 
/?' — > 1 we arrive at 73 (see, again. Appendix D). The difference with the first model 
is, then, that we can approach the Gaussian limit arbitrarily fast, at the price, of course, of 
having a function P{xk, Xi) that is indistinguishable from a Dirac 5-distribution in Xk — Xi, 
and hence meaningless for practical purposes. 



3.3.4 Fourier diaphony with sum clustering 

In the above, we have let the strength cTfi depend on the product of the various r{n^). This 
can be seen as mainly a matter of expediency, since the generalization to s > 1 is quite 
simple in that case. From a more 'physical' point of view, however, this grouping of the 
(T is not so attractive, if we keep in mind that each n corresponds to a mode with wave 
vector k{n). Under the product rule, wave vectors differing only in their direction but 

— * 

with equal length may acquire vastly different weights: for instance, k — {m^/s, 0,0,...) 

— * 

and k = (m, m,m, . . .) have equal Euclidean length, m\/s, but their strengths under the 
product rule are l/(sm^) and l/(m^*), respectively. This lack of 'rotational' symmetry 
could be viewed as a drawback in a discrepancy distinguished by its nice 'translational' 
symmetry. One may attempt to soften this problem by grouping the strengths in another 
way, for instance by taking 

<Jn = a(^k{n^)j , (62) 

so that (7 depends on the sum of the components rather than on their product. The 
multiplicity of a given strength now becomes, in fact, somewhat simpler: 

«?w = 5:«(.-tM'.))^E(;)C-:!r1' 

n>0 \ /i=l / m>0 \ / \ ^ / 
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where the last identity follows from the generating function 



p>0 ^ 



(64) 



This also immediately suggests the most natural form for the strength: cr^ = P^, where p 
is Ylix^{''^ij) ^ above. We see that Ri converges as long as /3 < 1, and moreover, 



73 = 



1+^ 
1-/32 



(65) 



where a(/3) has supremum a(0) = 1, and decreases monotonically with increasing /3. For 
f3 close to one, we have a{f3) ~ 4(1 — f3)/9, so that the Gaussian limit can be reached as 
quickly as desired (again with the reservations mentioned above). At the other extreme, 
note that for very small /3 we shall have 



73 



1 

2i 



if < 1 



(66) 



This just reflects the fact that, for extremely small /?, only the 2s lowest nontrivial modes 
contribute to the discrepancy; and even in that case the Gaussian limit is attained, although 
much more slowly. The criterium that determines whether the behaviour of 73 with s and 
P is exponential or of type l/(2s) is seen to be whether is considered to be large or 
small, respectively. 

Another alternative might be a power-law-like behaviour of the strengths, such as cr| = 
1/p". Also in this case we may compute the Rk, as follows: 



00 



(67) 



from which it follows that « > s to ensure convergence of E [Dn\- In the large-s limit, we 
therefore find that, also in this case, 73 l/(2s). 



3.3.5 Fourier diaphony with spherical clustering 

A clustering choice which is, at least in principle, even more attractive from the symmetry 
point of view than sum clustering, is to let cr^ depend on |/c(n)p, hence assuring the 
maximum possible amount of rotational invariance under the constraint of translational 
invariance. We therefore consider the choice 
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For the function (3{xi,X2) — P{xi — X2) we now have the following two alternative forms, 
related by Poisson summation: 



s / +00 \ 

P{x) = -1 + n "^'"^^ cos{2nkx>') 



fJ.=l \k=—co 

of which the first converges well for large, and the second for small, values of a; the sum 
over m extends over the whole integer lattice. The Rk are, similarly, given by 
+00 ^ ^ 



ft = E 



1 



\q=—oo 



\m=— 00 / 

For large a (where, again, only the first few modes really contribute) we recover, again, 
the limit 73 l/(2s) as s — > 00: for small a we have, again, an exponential approach to 
the Gaussian limit: 

73 ~ I — I as s — > 00 . (71) 



^97r J 

The distinction between the two limiting behaviours is now the magnitude of the quantity 
sexp(— 2q;), which now takes over the role of the of the previous paragraph. 

3.3.6 Walsh diaphony 

Another type of diaphony is based on Walsh functions, which are defined as follows. Let, 
in one dimension, the real number x be given by the decomposition 

X = 2-^xi + 2-'^X2 + 2^^X3 + ■■■ , e {0, 1} , (72) 
and let the nonnegative integer n be given by the decomposition 

n = ni + 2712 + 2^713 + 2^714 H , 71^ e {0, 1} . (73) 

Then, the n^^ Walsh function W„(x) is defined as 

The extension to the multidimensional case is of course straightforward, and it is easily 
seen that the Walsh functions form an orthonormal set. The Walsh diaphony is then given 

by 

= ^E'^n-(EW^n(x,)) . (75) 
n>0 \A;=1 / 
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In Ref. |T3], the following choice is made: 



r(n) = ^(n = 0) + ^(r2 > 0)^2^0 (2^ < n < 2*^+^) . (76) 

Note that, in contrast to the Fourier case where each mode of frequency n contains two basis 
functions (one sine and one cosine), the natural requirement of 'translational invariance' 
in this case requires that the Walsh functions from TP up to TP^^ get equal strength. The 
clusterings are therefore quite different from the Fourier case. We slightly generalize the 



notions of Ref. 131, and write 



fl— , 

A^Ar(n^)2 

r{n) = e{n = 0) + e{n > 0) ^ {ajS^)'^'^ 9(2^ <n< 2^'+^) . (77) 

Here, we have disregarded the overall normalization of the cr's since it does not influence 
the Gaussian limit. It is an easy matter to compute the Rk', we find 



n>0 



SO that the requirement E[Df^] = Ri < oo implies that we must have (3 < 1/2. Therefore, 
for not too small values of a, we have 

73 ~ a(a,py . a(a,p) = \, ^ J^^, _ ,;,\>. ■ (79) 

The choice made in Ref. [0] corresponds to a = 1 and (3 = 1/4, for which we find 
a(l,l/4) ~ 0.4197. The Gaussian limit should, therefore, be a good approximation for 
s larger than 6 or so. An interesting observation is that for fixed jS, a{a,(3) attains a 
minimum at a = (1 — 2(3^) /{I — 2/3^), so that the choice (3 = 1/4 could in principle lead 
to a(31/28, 1/4) = 0.4165 with a marginally faster approach to the Gaussian. The overall 
infimum is seen to be a(3/2, 1/2) = 2/11 ~ 0.182. As in the Fourier case with product 
clustering and a power-law strength, there is a limit on the speed with which the Gaussian 
is approached: in both cases this is directly related to the type of clustering. 
At the other extreme, for very small a we find the limiting behaviour 

(1 - 2(3y 1,2 

73-^^3^- if sa^«l. (80) 

Again in this case, the slowest possible approach to the Gaussian limit is like 1/s, directly 
related to the symmetry of the discrepancy definition with respect to the various coordinate 
axes. 
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3.4 Lego discrepancy 

Another class of integrands and discrepancies can be constructed by dissecting the hyper- 
cube K into M non-overlapping bins Am (m = 1, 2, . . . , M), and taking the characteristic 
functions dm of the bins as the basis functions of the measure. Then w„ is the volume of 
Am, and 



M 

E 

m=l 



Wm — 1 and (lm,n — WnSm,n 



Note that in this case n runs over a finite set of values. Moreover, this model is dimension- 
independent, in the sense that the only information on the dimension of K is that contained 
in the value of M: if the dissection of K into bins A^ is of the hyper-cubic type with p bins 
along each axis, then we shall have M = . Also, a general area-preserving mapping of K 
onto itself, such as the Arnol'd cat-transform, will leave the definition of the discrepancy 
invariant in the sense that it will lead to a distortion (and possibly a dissection) of the 
various bins Am, but this influences neither Wm nor (by definition) am- Owing to the 
finiteness of M, a finite point set can, in fact, have zero discrepancy in this case, namely if 
every bin Am contains precisely WmN points (assuming this number to be integer for every 
m). 

The matrix Tm,n has now indices that label the bins (m, n = 1,2, . . . M), where M is 
the total number of bins: 

We shall now examine under what circumstances the criterion ( pT]) for the appearance of 
the Gaussian limit is fulfilled. The eigenvalues Aj of the matrix Tm,n are, of course, given 
as the roots of the eigenvalue equation 



M 



\m=l 




. (83) 



It is seen that there is always one zero eigenvalue (the corresponding eigenvector has 1 / am 
for its m^^ component). Furthermore the eigenvalues are bounded by ma.XmicTm'^m), and 
this bound is an eigenvalue if there is more than one m for which the maximum is attained. 
At any rate, we have for our criterion, that 

^ ^ ^ max^(a^w^)^ ^ max^(a^w^)^ 

EA? - Tr(r2) E<wi{i-2wm) + {E^ly^mr ■ ^ ' 

i mm 

Since the generality of the Lego discrepancy allows us to choose from a multitude of pos- 
sibilities for the cr's and w^s, we now concentrate on a few special cases. 
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1. All Wm equal. This models integrands whose local details are not resolved within 
areas smaller than 1/M, but whose magnitude may fluctuate. In that case, we have 

n / 1 (max^ g^)^ 

^ < 1-2/M ' ^ ^ 

n 

and a sufficient condition for the Gaussian limit is for this bound to approach zero. 
Note that here, as in the general case, only bins m with am 7^ contribute to the 
discrepancy as well as to the criterion C, so that one has to be careful with models 
in which the integrand is fixed at zero in a large part of the integration region K: 
this type of model was, for instance, examined in Ref. Wi . 



2. All am equal. In this case, the underlying integrands have more or less bounded 
magnitude, but show finer detail in some places (with small w) than in other places 
(with larger w). Now, it is simple to prove that 

C < 7— , w = maxwm , (86) 

so that a sufficient condition is that Mw'^ should approach zero. 

3. All amWm equal. This choice models functions in which the largest fluctuations 
appear over the smallest intervals. Although not a priori attractive in many cases, 
this choice is actually quite appropriate for, e.g. particle physics where cross sections 
display precisely this kind of behaviour. In this case we simply have 

^ " (M + 2)(M-1) ' ^^'^^ 
and the Gaussian limit follows whenever M ^ 00. 



4 Conclusions 

We have shown that a large class of discrepancies, including the L2 star-discrepancy and 
the diaphonies, can be formulated as the induced discrepancy of a class of functions defined 
by a countable set of basis functions. These basis functions we called modes. For such a 
discrepancy we derived the probability distribution, in the limit of a large number of points, 
over the ensemble of truly random point-sets. We have shown under what conditions this 
distribution tends to a Gaussian. In particular, the question of the limiting behaviour of a 
given distribution can be reduced to solving an eigenvalue problem. Using the knowledge of 
the eigenvalues for a given function class it is possible to determine under which conditions 
and how fast the Gaussian limit is approached. Finally, we have investigated the limiting 
behaviour of the probability distribution for the discrepancy of several function classes 
explicitly. 
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The discrepancy that most rapidly approaches the Gaussian hmit occurs for models in 
which the number of modes with non-zero equal strength goes to infinity, while the sum of 
the strengths is fixed. In fact, we give an argument why we cannot improve much on this 
limit. However, a drawback of this model is that the discrepancy itself becomes a sum of 
Dirac 5-functions in this limit: it only measures whether points in Xat coincide or not, and 
is therefore not very useful in practice. 

Secondly, we have examined the L2 star-discrepancy. Here a Gaussian distribution 
appears in the limit of a large number of dimensions. It is however a very slow limit: only 
when the number of dimensions becomes of the order O (10^) does the Gaussian behaviour 
become manifest. 

For the various diaphonies, the choice of the mode-strengths is more arbitrary. The 
strengths we discuss are chosen on the basis of some preferred global properties of the 
diaphony, such as translation- and/or rotation-invariance. Again for large dimensions the 
Gaussian limit is attained, either as a power-law or inverse of the number of dimension. 
It is possible to choose the strengths in such a way that the Gaussian limit is approached 
arbitrarily fast. But the diaphony corresponding to that case again consists of a sum of 
Dirac 5-functions. 

Finally, for the Lego-discrepancy, we can assign strengths to the different modes in 
several ways. One possibility is to keep the product of the squared strength and volume of 
the modes fixed: then, the Gaussian limit is reached for a large number of modes. 

All these results have been derived in the limit of large number of points. It remains to 
be seen however whether this is reasonable in practice. To determine when the asymptotic 
regime sets in, i.e. for which value of N , it is necessary to take into account the next-to- 
leading contributions. This will be the subject of Ref. [jl5|. 



Appendix A: The form of Gq[z) 

In this Appendix, we derive the result ( p4|) for the form of Gq{z) in terms of the quantities 
A and B of Eq. (^). For simplicity of notation, we shall assume the discrete case where the 
Am,n is a matrix, and the B^ a vector; the indices m, n are then what we called the variables 
y in the foregoing. Moreover, let us denote by [BA^B] the sum „ BjYijA^)^ fiB,fi. Since 
the matrix Tm,n can be written as 

the k^^ power of this matrix has the general form 

(r^)^,„ = {A'Un- E ^^'f 7^' iA^BUiBA% l[i-[BA^B]y^ , (89) 

Pii3:'^0,i,2,...>0 r>0 

with the constraint k — 1 = p + q + uq + 2vi + 3^2 + ■ ■ ■ . The combinatorial factor follows 
directly from the possible positionings of the dyadic factors —BmBn- Multiplying by (2^)^^"^ 
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and summing over the k then gives us immediately 

— n>l — 

(90) 

where the last factor, with r + 1, comes from the double sum over p and q with p + g = r. 
Upon integration of this result over t from to 2 we find 

log(Go(^)) = E^Tr(n 

n>0 



E ^Tr (A") - i log (1 + J2i^zr[BA-'B]\ 

n>0 \ n>0 / 



(91) 



This result has, in fact, already been obtained for the case of the L2 star-discrepancy in 
Ref . , but here we demonstrate its general validity for more general discrepancy measures. 
In those cases where = 0, the second term of course vanishes. 

Appendix B: A counterexample 

In this Appendix we prove that the condition ( PT| ) for the occurrence of a Gaussian limit 
is, in a sense, the best possible. Namely, consider a set of eigenvalues A„, again adding up 
to unity as usual, defined as follows: 

Ai = A , 

A„ = (1-A)/(M-1) , 71 = 2,3,... ,M , 

A„ = , n> M . (92) 

Clearly, A will indeed be the maximal eigenvalue as long as M > 1/A. Now, 

A2 A^ 



EnK A2 + (1-A)V(M-1) 



(93) 



and this ratio can be driven as close to unity as desired by choosing M sufficiently large. 
This shows that the simple condition A — > is not always enough to ensure the Gaussian 
limit. 

Appendix C: Spectral representation of the L2 star- 
discrepancy 



Mercer's theorem||T^ states that a nonnegative-definite and continuous function on (1, 0] x 



'1,0] has a spectral decomposition. Applying this to the function min(x5', 0:2), then tells 
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us that the two-point connected Green's function g of the Wiener measure has a spectral 
decomposition of Eq. (|39|) . The eigenvalues a'i and eigenfunctions for g are given by 
Eq. (|T]) and Eq. {^). 

To show that the discrepancy defined through the functions ua is the same as the L2 
star-discrepancy pinned down at (1, 1, . . . ,1), we prove the equality of the /3-functions for 
the two measures: 

n 

(s s \ / s s \ 

n ^« -yn-U^i-yn][U d{A -yn-\{{^- yn dy . (94) 

Evaluating both sides of the equation we obtain: 

n 

= 9{xux2) - n K - iK)') - n (^2 - K^2)') + (0' • (95) 

The first terms on both sides of the equation cancel trivially. A small calculation shows 
that the same applies to the last terms on both sides of the equation. It thus remains to 
show that 

s 

X;2^/V|«,(x) = n(^"-i(^")') • (96) 

This problem again factorizes for the different coordinates (omitting indices): 
/2\^ 00 

2 [-) E (^;^ ((2^ + l)f x) = X - Ix^ , (97) 



which is nothing but stating that the Ihs of Eq. (^) is the Fourier decomposition of the 
rhs. To prove this, let / be the following periodic extension of x — |x^ : 

lx2-3x + 4 xG(2m-2,2m] 

G (2m, 2m + 2] ' ' 

where m is any integer. The function / is a parabolic approximation of |sin(47rx). It is 
continuous and differentiable on R. Hence it can be written as a Fourier series, based on 
a period of 4 rather than 1. An explicit calculation shows that the only non-zero terms 
comes from the functions sin ((2n + l)f x) (n = 0, 1, 2, ... ). The Fourier coefficients 
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are given by Q 

i= fix) sin ((2n + rfx = 2v^ 0) ^^-1^ . (99) 

Thus the Fourier series is exactly given by the Ihs of Eq. (^7^ . 

Appendix D: The magnitude of Q^{p) 

Here we present the proofs of our various statements about the multiphcity function (p) 
of section |3.3.3| . In the first place, we know that its Dirichlet generating function, F^^\x), 



converges for all x > 1. Now suppose that Q^{p) exceeded cp" an infinite number of 
times, with c > and a > 1. The Dirichlet generating function would then contain an 
infinite number of terms all larger than c, for 1 < x < a, and therefore would diverge, in 
contradiction with its convergence for all x > 1. 

(2) 

In the second place, consider the 'standard' generating function, Fs (x). By inspecting 
how many of the vector components of n are zero, we see that we may write, for p > 1, 



t=l ^ ^ n>0 V At=l 



nj , (100) 



so that dt{p) counts in how many ways the integer p can be written as a product of t 
factors, including ones; this function is discussed, for instance, in Ref. Now, for p 

prime, we have dt{p) = t, and therefore 

Q^ip) > 2s(3*^^) , equality for p prime . (101) 

(2) 

The radius of convergence of Fs (x) is therefore at most equal to unity. On the other 
hand, we can obtain a very crude, but sufficient, upper bound on Q^{p) as follows. Since 
dt{p) is a nondecreasing function of t, we may bound (p) by (3"^ — l)ds{p). Now let kp be 
the number of prime factors in p; then kp cannot exceed log(p)/log(2), and only is equal 
to this when p is a pure power of 2. Also, the number of ways to distribute k object in s 
groups (which may be empty) is at most s^, and is smaller if some of the objects are equal. 
Therefore, ds{p) is at most s'^'', and we see that 

Q^{p) < (3-_i)pi°gW/i°g{2) ^ (102) 
or, in short, is boundedFI by a polynomial in p. Therefore, the radius of convergence of 



Fs (x) is also at least unity, and we have proven the assertion in Eq. 3.3.3 



"^We take the functions normalized such that they form a orthonormal set on (0, 4], so the Fourier series 
is in terms of the sine- and cosine functions divided by V2- 

^Note that equahty cannot occur in this case since the two requirements are mutually exclusive. 
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Finally, we consider the limit 




(103) 



The same reasoning that led us to the radius of convergence shows that, for x approaching 
1 from below, the function F^^\x) behaves as (1 — x)~'^^ with c > 1. Therefore, 73 will 
behave as (8(1 — ,t)/9)^, and approach zero as x — > 1. Note that the upper bound on Q^{p) 
is extremely loose: but it is enough. 
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